Joint Versus Independent Phonological Feature Models within CRF Phone Recognition
Abstract
We compare the effect of joint modeling of phonological features to independent feature detectors in a Conditional Random Fields framework. Joint modeling of features is achieved by deriving phonological feature posteriors from the posterior probabilities of the phonemes. We find that joint modeling provides superior performance to the independent models on the TIMIT phone recognition task. We explore the effects of varying relationships between phonological features, and suggest that in an ASR system, phonological features should be handled as correlated, rather than independent.
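The derivation of feature posteriors from phone posteriors described above can be illustrated with a minimal sketch. It assumes the posterior of a feature value is the sum of the posteriors of all phones carrying that value. The phone inventory, the feature table, and the name `feature_posteriors` are hypothetical toy choices for illustration, not the paper's actual configuration.

```python
# Hypothetical toy inventory: the paper's real phone set and
# phonological feature classes are not listed in this abstract.
PHONES = ["p", "b", "m", "s"]
FEATURES = {
    "voicing": {"p": "voiceless", "b": "voiced", "m": "voiced", "s": "voiceless"},
    "manner": {"p": "stop", "b": "stop", "m": "nasal", "s": "fricative"},
}


def feature_posteriors(phone_post, phones=PHONES, features=FEATURES):
    """Derive feature posteriors from one frame of phone posteriors.

    Assumption: P(feature = v | x) is the sum of P(phone | x) over all
    phones whose value for that feature is v.
    """
    result = {}
    for feat, mapping in features.items():
        values = {}
        for phone, p in zip(phones, phone_post):
            val = mapping[phone]
            values[val] = values.get(val, 0.0) + p
        result[feat] = values
    return result


# One frame of phone posteriors (made-up numbers that sum to 1).
frame = [0.1, 0.6, 0.2, 0.1]
post = feature_posteriors(frame)
print(post)
```

Because every feature posterior here is computed from the same phone distribution, the feature values stay mutually consistent, which is one way to read "joint" in contrast to training a separate detector per feature.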
Similar resources
Investigations into phonological attribute classifier representations for CRF phone recognition
Classifier combination has long been a staple for improving robustness of ASR systems; we present an experiment where introducing phonological feature scores from another lab’s system [1] into our system gives a statistically significant improvement in Conditional Random Field-based TIMIT phone recognition, despite a standalone system based on their features performing significantly worse. The ...
A Study on the Use of Conditional Random Fields for Automatic Speech Recognition
Current state of the art systems for Automatic Speech Recognition (ASR) use statistical modeling techniques such as Hidden Markov Models (HMMs) and Gaussian Mixture Models (GMMs) to recognize spoken language. These techniques make use of statistics derived from the acoustic frequencies of the speech signal. In recent years, interest has been rising in the use of phonological features derived fr...
Dynamic evidence models in a DBN phone recognizer
This paper describes an implementation of a discriminative acoustical model – a Conditional Random Field (CRF) – within a Dynamic Bayes Net (DBN) formulation of a Hierarchic Hidden Markov Model (HHMM) phone recognizer. This CRF-DBN topology accounts for phone transition dynamics in conditional probability distributions over random variables associated with observed evidence, and therefore has l...
Organizing phone models based on piecewise linear segment lattices of speech samples
Aiming at robust speech recognition, we have proposed a framework for “phonological concept formation,” which is the task of acquiring an efficient representation of phonemes from spoken word samples without using any transcriptions except for the lexical classification of the words. In order to implement this task, we propose the “piecewise linear segment lattice (PLSL)” model for phoneme repr...
Extracting phonological chunks based on piecewise linear segment lattices
The task of our research is to form phone-like models and a phoneme-like set from spoken word samples without using any transcriptions except for the lexical identification of each word in a vocabulary. This framework is derived from two motivations: 1) automatic design of optimal speech recognition units and structures of phone models, and 2) multi-lingual speech recognition based on languagein...
Publication date: 2007